Syntactic heads in statistical language modeling

Authors

  • Jun Wu
  • Sanjeev Khudanpur
Abstract

The use of syntactic structure in general, and heads of syntactic constituents in particular, has recently been shown to be beneficial for statistical language modeling. This paper provides an insightful analysis of this role of syntactic structure. It is shown that the predictive power of syntactic heads is mostly complementary to the predictive power of N-grams: they help in positions where an intervening phrase or clause separates the heads from the word being predicted, making the N-gram a poor predictor. Furthermore, a significant portion of this predictive power comes in the form of a more sophisticated back-off effect via the syntactic categories (nonterminal tags) of the heads. Finally, it is shown that using the categories of the syntactic heads is better than using the categories (part-of-speech tags) of the two preceding words, confirming that it is the syntactic analysis and not just the improved back-off strategy which leads to improvements over N-gram models. Experimental results for perplexity and word error rate are presented on the Switchboard corpus to support this analysis.

Similar resources

Shallow Semantics with Shallow Syntax

Assigning semantic roles to the constituents of a natural language sentence is an important first step in translating natural language into a logical form for further processing. I present a statistical classifier which can perform this task using minimal syntactic cues. I use the syntactic and the semantic head of each constituent as the only features and present simple rules for extracting th...

Statistical Language Modeling with Performance Benchmarks using Various Levels of Syntactic-Semantic Information

Statistical language models using n-gram approach have been under the criticism of neglecting large-span syntactic-semantic information that influences the choice of the next word in a language. One of the approaches that helped recently is the use of latent semantic analysis to capture the semantic fabric of the document and enhance the n-gram model. Similarly there have been some approaches t...

Perception Development of Complex Syntactic Construction in Children with Hearing Impairment

Objectives: Auditory perception or hearing ability is critical for children in acquisition of language and speech hence hearing loss has different effects on individuals’ linguistic perception, and also on their functions. It seems that deaf people suffer from language and speech impairments such as in perception of complex linguistic constructions. This research was aimed to study the pe...

Gender-Based Investigation of the Syntactic Development of Iranian EFL Learners: A Focus on Processability Theory

Pienemann (1998, 2015) put forward Processability Theory to explain why language learners follow definite developmental paths. The aim of the present study was to run a comparative investigation into the difficulty order of different grammatical structures for male and female Iranian EFL learners predicted by Processability Theory. 185 Iranian university students took part in this study. They...

Decision Tree-Based Syntactic Language Modeling

Title of dissertation: DECISION TREE-BASED SYNTACTIC LANGUAGE MODELING Denis Filimonov, Doctor of Philosophy, 2011 Dissertation directed by: Dr. Mary Harper Department of Computer Science Dr. Philip Resnik Department of Linguistics Statistical Language Modeling is an integral part of many natural language processing applications, such as Automatic Speech Recognition (ASR) and Machine Translatio...

Publication date: 2000